ABSTRACT

In this thesis, we discuss the development of the neces- 
sary tools for the performance evaluation of a multi-backend 
database system, known as MDBS. The basic motivation of the 
multi-backend database system (MDBS) is to develop an
architecture which spreads the work of the database system 
among multiple backends. It is a major aim of this system 
to allow capacity growth by the use of additional disk 
drives and performance improvement by the use of additional 
backends. However, to verify the design and implementation, 
it is necessary to test the capability of MDBS in capacity 
growth and performance gain. 

Three tools for the performance and capacity tests are 
investigated. The first tool is the file generation package 
which creates test files for any artificial database. The 
second tool is the database load subsystem which loads the
artificial database into MDBS. The third tool is the
request generation package. This package creates test
requests to query MDBS.

The following methodology is used to create an effective 
tool. First, the properties of an ideal tool are described. 
Then available existing programs are reviewed and evaluated 
to determine which program best meets the desired features. 
Lastly, the programs are upgraded to ensure that they are 
compatible with the current implementation, and meet the 
desired features. 

The main goal is to develop the necessary tools to
generate tests that measure the extensibility of MDBS, i.e.,
how does MDBS perform as more backends are added?
Performance is expected to improve as the number of backends
is increased, and to be maintained as the size of the
database is increased.
I. INTRODUCTION



This chapter presents a brief review of the multi-backend
database system (MDBS). First, the physical arrangement of
MDBS is presented. This is followed by a presentation of
the process structure of MDBS. Lastly, the actions taken in
servicing requests, both insert and non-insert requests, are
reviewed. References are cited for the reader interested in
gaining a more detailed understanding of MDBS.

A. THE MULTI-BACKEND DATABASE SYSTEM

The multi-backend database system (MDBS) uses one
minicomputer as the master or controller, and a varying number
of minicomputers and their disks as slaves or backends.
MDBS is designed to provide database growth and performance
enhancement by the addition of identical backends. No
special hardware is required. The backends are configured 
in a parallel fashion. A new backend may be added by simply 
replicating the existing software on the new backend, thus 
avoiding reprogramming efforts. A prototype MDBS has been 
completed in order to carry out the design verification and 
performance evaluation developed in [Ref. 1] and [Ref. 2]. 
The implementation efforts are described in [Ref. 3] through 
[Ref. 5]. 

The equipment configuration of the system is shown in
Figure 1.1. The host computer is connected to MDBS through
the controller. The backends are connected to the
controller through a broadcast bus. When the controller
receives a request from the host, it delivers the request
to all backends simultaneously over the broadcast bus.
Since the data is distributed across all backends, all
backends can execute a request in parallel.

Figure 1.1

The division of labor between the controller and the
backends is illustrated through the process structure of
Figure 1.2. The MDBS controller handles three functions.
The request preparation function prepares a request for
transmission to the backends. The insert information
generation function processes the insert requests, which
require additional information used by the backends. The
post processing function handles the work necessary when the
replies are returned to the controller from the backends but
before they reach the host.

The backends in MDBS carry out three different
functions. The directory management function performs
descriptor search, cluster search, address generation, and
directory table maintenance. The record processing function
performs record storage, record retrieval, record selection,
and attribute-value extraction of the retrieved records.
The concurrency control function performs operations to
ensure that the concurrent and interleaved execution of the
user requests will keep the database consistent.

Before proceeding to describe the sequence of actions
required to service a request, some terminology is
reviewed. The smallest unit of data is a
keyword, which is an attribute-value pair. Information is
stored in terms of records, which are made up of keywords
and a record body. A predicate is of the form (attribute,
relational operator, value). A query is any Boolean
expression of predicates. Records are logically grouped into
clusters based on the attribute values and the
attribute-value ranges in the records. Internally, the values and
value ranges are called descriptors. For the user, these
attribute values are termed keywords. Each descriptor is
identified by a descriptor id to save computing time and
memory space. A prespecified set of requests is referred to
as a transaction.

Figure 1.2 PROCESS STRUCTURE OF MDBS.
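
The terminology above can be made concrete with a small sketch.
It is illustrative only and is not taken from the MDBS source:
the class and function names are hypothetical, the set of
relational operators is assumed, and the query is assumed to be
written in disjunctive normal form (a list of conjunctions of
predicates), which is one convenient way to represent a Boolean
expression of predicates.

```python
import operator
from dataclasses import dataclass

# Assumed set of relational operators; the thesis does not list them.
OPS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}


@dataclass
class Record:
    keywords: dict   # attribute -> value; each pair is one keyword
    body: str = ""   # the record body


@dataclass(frozen=True)
class Predicate:
    attribute: str
    op: str
    value: object

    def matches(self, record):
        # A record lacking the attribute cannot satisfy the predicate.
        if self.attribute not in record.keywords:
            return False
        return OPS[self.op](record.keywords[self.attribute], self.value)


def satisfies(record, query):
    """Evaluate a query given as a list of conjunctions of predicates."""
    return any(all(p.matches(record) for p in conj) for conj in query)
```

For example, a record with keywords FILE=EMP and SALARY=30000
satisfies the query (FILE = EMP and SALARY > 25000) or
(FILE = DEPT) through its first conjunction.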

B. REQUEST EXECUTION

This section describes the sequence of actions taken by 
MDBS in carrying out a request. First, the insert request 
will be discussed. Then the non-insert requests will be 
described. Non-insert requests are requests for deletion, 
retrieval, or update. 

1. Actions for Insert Requests

The sequence of actions for an insert request is
shown in Figure 1.3. A request from the host machine enters
the Request Preparation process. Request Preparation sends
the number of requests in the transaction to Post
Processing so that Post Processing can determine when a
transaction is completed. Request Preparation may send an
error to Post Processing if there is a syntax error in the
request. When a transaction is completed, Post Processing
sends the results to the host machine. Request Preparation
then broadcasts the request to Directory Management. Each
backend finds the descriptor ids associated with the
request. The backends then exchange descriptor id
information.

After receiving the descriptor ids from the other
backends, Directory Management sends the cluster id to
Insert Information Generation. Insert Information
Generation then determines which backend is to store the
record. The selected backend determines the address of the
new record and stores it. The other backends discard the
record. Finally, Record Processing sends an
action-completed message to Post Processing, which in turn
informs the host.
Figure 1.3 SEQUENCE OF ACTIONS FOR AN INSERT REQUEST.
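
The insert sequence can be sketched in a few lines. This is a
simplified model under stated assumptions, not the MDBS
implementation: the exchange of descriptor ids among backends
is collapsed into one lookup, a cluster is identified by the
tuple of descriptor ids of the record, and Insert Information
Generation is assumed to rotate round-robin among the backends
within each cluster. All names are hypothetical.

```python
def descriptor_ids(record, ddit):
    """Map each keyword of the record to a descriptor id.

    ddit: attribute -> list of (low, high, descriptor_id) ranges.
    """
    ids = []
    for attribute, value in sorted(record.items()):
        for low, high, descriptor_id in ddit.get(attribute, []):
            if low <= value <= high:
                ids.append(descriptor_id)
                break
    return tuple(ids)


class InsertInformationGeneration:
    """Controller-side process that picks the backend for a record."""

    def __init__(self, n_backends):
        self.n_backends = n_backends
        self.next_backend = {}   # cluster id -> backend to use next

    def choose_backend(self, cluster_id):
        # Assumed policy: round-robin within each cluster.
        chosen = self.next_backend.get(cluster_id, 0)
        self.next_backend[cluster_id] = (chosen + 1) % self.n_backends
        return chosen


def insert(record, ddit, iig, backends):
    """Broadcast an insert; only the chosen backend keeps the record."""
    cluster_id = descriptor_ids(record, ddit)
    chosen = iig.choose_backend(cluster_id)
    for number, store in enumerate(backends):
        if number == chosen:
            store.append(record)   # the selected backend stores the record
        # every other backend discards its copy
    return chosen
```

With two backends, successive inserts into the same cluster
alternate between backend 0 and backend 1 under this assumed
policy.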
2. Actions for Non-insert Requests

The sequence of actions for a non-insert request is
shown in Figure 1.4. Only the actions for a retrieve
request will be discussed, since the other types of
requests are quite similar. A request from the host machine
enters the Request Preparation process. Request Preparation
sends the number of requests in the transaction to Post
Processing so that Post Processing can determine when a
transaction is completed. Request Preparation may send an
error to Post Processing if there is a syntax error in the
request. When a transaction is completed, Post Processing
sends the results to the host machine. Request Preparation
then broadcasts the request to Directory Management. Each
backend finds the descriptor ids associated with the
request. The backends then exchange descriptor id
information.

After receiving the descriptor ids from the other
backends, Directory Management determines the cluster ids.
Lastly, Directory Management determines the addresses of the
records of the identified clusters. Record Processing gets
the records from secondary storage and extracts the
necessary information. If aggregate operators, for example
the average, are specified in the retrieve request, they are
applied at this time. The partially aggregated values are
sent to Post Processing. Post Processing sends the results
to the host following any further aggregate operations.
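
The two-stage aggregation can be illustrated for the average.
The sketch below assumes that each backend reduces its own
records to a (sum, count) pair and that Post Processing
combines the pairs; the thesis does not state the form of the
partial results, and the function names are hypothetical.

```python
def backend_partial_average(values):
    """Each backend reduces its own records to a (sum, count) pair."""
    return (sum(values), len(values))


def post_processing_average(partials):
    """Post Processing combines the partial results of all backends."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None
```

Carrying the count alongside the sum matters: averaging the
per-backend averages would weight a backend holding one record
the same as a backend holding a thousand.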

This concludes the review of MDBS. Attention is now 
turned toward performance issues of this system in the 
following chapter. 
Figure 1.4 SEQUENCE OF ACTIONS FOR A NON-INSERT REQUEST.



II. PERFORMANCE EVALUATION



A. TWO VIEWS OF PERFORMANCE MEASUREMENT

Now that MDBS has been described, it is reasonable
to ask, "How does one determine the performance of such a
system?" There are two viewpoints of performance
evaluation. The first is the macroscopic viewpoint, in which
the key performance measurement is the relative response
time. The second is the microscopic viewpoint. This
viewpoint is concerned with measuring the times needed to
perform the various subtasks which are carried out in
servicing a request. In [Ref. 6], the motivation for the
macroscopic measurement is provided. This chapter is
concerned with describing the performance issues which arise
when using the macroscopic viewpoint. Thus, in testing
MDBS, the macroscopic viewpoint will be used before
proceeding to the microscopic viewpoint.

B. CRITERIA FOR PERFORMANCE EVALUATION AND TOOL SELECTION

1. Macroscopic Viewpoint

As stated above, with the macroscopic viewpoint the
key performance measurement is the relative response time.
That is, the concern lies mainly with the effect of various
changes to the system on the response time. These changes,
and therefore the resulting relative response times, are
driven by the variables described in the following section.



2. Performance Issues

The macroscopic viewpoint is [...]
four categories of variables and obs[...]
the relative response time. These v[...]
configuration variables, cluster [...]
request construction variables, and s[...]
The system configuration va[...]
database remains constant and th[...]
of descriptors on any attribute incre[...]

Thus it can be seen that several variables influence
the performance of MDBS. This is not an all-inclusive list.
However, the list will serve as a basis for developing the
desired properties of each performance tool. Each tool will
be discussed along with its desired properties in the
following sections.

C. DESIRABLE PROPERTIES OF THE TEST FILE GENERATION PACKAGE

The purpose of the file generation package is to create
an artificial database which will eventually be loaded into
MDBS. This is the first tool to be used for the evaluation.
Several parameters are likely to be varied in light of
the performance issues. The input parameters to such a
package may include: file size in number of records per
file, attribute-value size in bytes of storage, record size
in number of attribute values, data types of attribute
values, and database size in number of files per database.
In addition, parameters must indicate whether values of
attributes are taken from random functions or from
predetermined sets, and whether uniqueness of values is
desired.
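
As an illustration of how these input parameters might drive a
generator, the following sketch builds one test file in memory.
It is a hypothetical design, not the package described in this
thesis: the names AttributeSpec, draw, and generate_file are
invented, sizes are treated as character or digit counts, and a
bounded retry is assumed as the way to enforce uniqueness.

```python
import random
import string
from dataclasses import dataclass


@dataclass
class AttributeSpec:
    name: str
    dtype: str              # "int", "float" or "str"
    size: int               # attribute-value size (digits or characters)
    value_set: tuple = ()   # predetermined set; empty means random values
    unique: bool = False    # whether values may repeat within the file


def random_value(spec, rng):
    """Draw one uniformly distributed value of the requested type."""
    if spec.dtype == "int":
        return rng.randrange(10 ** spec.size)
    if spec.dtype == "float":
        return round(rng.random() * 10 ** spec.size, 2)
    return "".join(rng.choice(string.ascii_uppercase)
                   for _ in range(spec.size))


def draw(spec, rng, seen):
    """Pick a value from the set or the random function, honoring unique."""
    for _ in range(10_000):   # bounded retries keep the loop finite
        if spec.value_set:
            value = rng.choice(spec.value_set)
        else:
            value = random_value(spec, rng)
        if not spec.unique or value not in seen:
            seen.add(value)
            return value
    raise ValueError(f"cannot find a new unique value for {spec.name}")


def generate_file(specs, n_records, seed=0):
    """Generate one file: a list of records (attribute -> value)."""
    rng = random.Random(seed)
    seen = {spec.name: set() for spec in specs}
    return [{spec.name: draw(spec, rng, seen[spec.name]) for spec in specs}
            for _ in range(n_records)]
```

A database of several files would then be a list of such
files, one call per file. Seeding the generator makes a test
file reproducible, so that an experiment can be repeated on
identical data.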

D. DESIRABLE PROPERTIES OF THE DATABASE LOAD SUBSYSTEM

The database load subsystem is responsible for taking 
the files created by the file generation package and for 
properly loading the files into MDBS. In the process of 
loading the database, the database load subsystem must also 
create the necessary tables used in directory management. 

The database load subsystem must be designed so that the
performance evaluation may utilize various cluster formation
variables and storage variables with minimum effort. The
cluster formation variables and storage variables with which
the performance evaluation may be concerned include the
following. The performance may be expected to depend upon
whether the number of descriptors (attributes) is large or
small. Certainly, when entering a large number of
descriptors (attributes), the chance for error in this
menial task is great. Therefore, the ease of specifying the
descriptors (attributes) must be guaranteed. The variation
of cluster size may affect performance. The cluster size is
a function of the number of descriptors, the size of the
input files, and the values used in the attribute fields.
Therefore, these three parameters should be entered
independently. The data placement strategy, i.e., how
records are distributed across the backends, also affects
performance. While simulation studies described in [Ref. 1]
and [Ref. 2] show that the
track-splitting-with-random-placement strategy is the
most desirable, the ability to change the placement strategy
will provide a means of confirming these studies.
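
The dependence of cluster size on the number of descriptors can
be seen in a small sketch. It assumes that a cluster is the set
of records sharing the same combination of descriptors and that
each descriptor is a closed value range; the function names are
hypothetical and every record is assumed to carry every
directory attribute.

```python
from collections import defaultdict


def descriptor_for(value, ranges):
    """Index of the first (low, high) range containing value, else None."""
    for descriptor_id, (low, high) in enumerate(ranges):
        if low <= value <= high:
            return descriptor_id
    return None


def form_clusters(records, descriptors):
    """Group records by the descriptors their attribute values fall in.

    descriptors: attribute -> list of (low, high) value ranges.
    """
    clusters = defaultdict(list)
    for record in records:
        key = tuple(descriptor_for(record[attribute], ranges)
                    for attribute, ranges in sorted(descriptors.items()))
        clusters[key].append(record)
    return dict(clusters)
```

With the same five records, two descriptors on SALARY yield
clusters of sizes 4 and 1, while four narrower descriptors
yield sizes 2, 1, 1, and 1. This is why the number of
descriptors, the file size, and the attribute values should be
adjustable independently.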

E. DESIRABLE PROPERTIES OF THE REQUEST GENERATION PACKAGE 

The request generation package is concerned with
creating and executing test requests. With this tool, the
performance evaluation team alters the request formation
variables in order to vary the following:
the percentage of each type of request (retrieve, update,
insert, or delete), the percentage of aggregate operators
(ave, max, min, sum, and count) in retrieve requests, the
complexity of the request query (a simple query will consist
of one to two predicates, and a complex query will consist
of ten to fifteen predicates), the order of the predicates
appearing in the request, and the number of attributes to be
projected in the retrieve request.
The request generation package must also possess the
ability to do the following: vary the length of the
transaction to determine its effect on system performance,
tag requests with user identification in order to test
concurrency control, retrieve a record defined over the
null descriptor, execute a retrieve request where the entire
cluster is stored at one backend, and compare the above
performance with a retrieve request where the cluster is
distributed across all backends.
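
A generator driven by these request formation variables might
look like the following sketch. It is an assumption-laden
illustration rather than the package itself: requests are
plain dictionaries, predicates are (attribute, operator, value)
triples over integer values, and the parameter names are
invented.

```python
import random

AGGREGATES = ("ave", "max", "min", "sum", "count")
OPERATORS = ("=", "<", ">")   # assumed subset of relational operators


def generate_requests(n, type_mix, aggregate_fraction, predicate_range,
                      attributes, seed=0):
    """Create n test requests.

    type_mix: request type -> relative weight (e.g. percentages).
    aggregate_fraction: share of retrieves carrying an aggregate.
    predicate_range: (fewest, most) predicates per query.
    """
    rng = random.Random(seed)
    kinds = list(type_mix)
    weights = [type_mix[kind] for kind in kinds]
    low, high = predicate_range
    requests = []
    for _ in range(n):
        kind = rng.choices(kinds, weights=weights)[0]
        if kind == "insert":
            # An insert carries a record rather than a query.
            record = {a: rng.randrange(1000) for a in attributes}
            requests.append({"type": kind, "record": record})
            continue
        query = [(rng.choice(attributes), rng.choice(OPERATORS),
                  rng.randrange(1000))
                 for _ in range(rng.randint(low, high))]
        request = {"type": kind, "query": query}
        if kind == "retrieve" and rng.random() < aggregate_fraction:
            request["aggregate"] = rng.choice(AGGREGATES)
        requests.append(request)
    return requests
```

A simple-query workload would pass predicate_range=(1, 2) and a
complex-query workload (10, 15), matching the ranges given
above.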

It is now appropriate to proceed to the details of each
of the above three tools. In the following chapter the test
file generation package is discussed. Chapter IV deals with
the details of the database load subsystem, and Chapter V
develops the test request generation package.
III. THE TEST FILE GENERATION PACKAGE



In this chapter, we discuss the test file generation 
package development. In the first two sections, we review 
the purpose and desired properties of the package. In the 
next two sections, we discuss how the basic program was 
selected from existing file generation tools. Finally, in 
the last two sections we discuss the upgrading of the 
selected program and future enhancements which will further 
aid the performance evaluation team. 

A. THE PURPOSE

The first set of performance evaluation experiments will
use test data which is generated by a program in the form
specified by the experimenter. This process may be viewed
in three steps. The first step consists of defining the
structure of the files to be generated. The second step
determines the source from which the values for the
specified attributes will be generated. Lastly, the files
are generated and stored for future use.

B. DESIRED PROPERTIES 

The input parameters to such a package may include: 
file size in number of records per file, attribute size in 
bytes of storage, record size in number of attribute values, 
data types of attributes, database size in number of files 
per database, whether values of attributes are taken from 
random functions or are selected from predetermined sets, 
and whether uniqueness of values is desired. 
C. EXISTING PROGRAMS



Two programs were reviewed in order to determine which
possesses the largest number of desired properties and still
would require the least effort to ensure system
compatibility with the current version of MDBS. The first of
the two programs was originally designed in [Ref. 3]. The
second was a later attempt to simplify the test file
generation package.

1. Original Test File Generation Package

In this program the test data is generated and
stored in files. Several characteristics of the files are
specified by the experimenter. Each file is given a name.
The data in the records is specified in a fixed number of
attribute-value pairs. The data types available for the
attributes are integer, string, and floating-point numbers.
The values are either taken from predetermined files, called
sets, created by the experimenter, or are randomly generated
by separate functions. Only a uniform distribution of the
various data types is available. This program contains all
of the desired properties stated above, except the ability
to guarantee uniqueness of the records created.

2. The Shortened Test File Generation Package

This program was written in order to reduce the
complexity of the original test file generation package.
Many of the features of the original program remain intact.
Two important differences exist. First, the shortened
version only allows the use of predetermined sets of values,
and therefore does not allow randomly generated values.
Second, the files generated must be no longer than 10,000
records. An advantage of the shortened version is that it
is combined with the shortened database load program, which
is discussed in the following chapter.

D. SELECTION OF THE TEST FILE GENERATION PACKAGE

The shortened version of the test file generation 
package was selected initially as the file generation tool. 
MDBS is currently undergoing a change in the version of the 
compiler used. In an attempt to keep the conversion of MDBS 
simple, the shortened version was chosen. This version 
allowed a rapid conversion. However, only user defined sets 
of values are selected for the attribute values. This is 
considered a disadvantage. Perhaps the overriding consider- 
ation in the selection of the shortened version was the fact 
that its associated database load subsystem was much
simpler. The discussion of this subsystem is provided in
detail in the following chapter.

E. THE UPGRADING PROCESS

The upgrading process for the shortened version of the 
test file generation package was relatively simple. The C 
compiler originally used in the implementation was an older 
version. The new version is being used by MDBS. Several 
minor compiler differences with respect to acceptable syntax 
were rapidly fixed. 

F. FUTURE IMPROVEMENTS

Because the shortened version possesses all but one of
the desired properties discussed in Chapter II, only one
future change is anticipated.

Two approaches exist for providing the shortened version
with the capability of randomly generating values. The
first alternative is to add the random functions to
the program, with an additional user interface to select
them as options. The second alternative is to adapt the
original test file generation package to be compatible with
the shortened database load. The task would be simplified
by choosing the first alternative.

This concludes the discussion of the test file genera- 
tion tool. In the following chapter, we discuss the proper- 
ties of the selected database load subsystem. 
IV. THE DATABASE LOAD SUBSYSTEM



In this chapter, we discuss the database load subsystem 
development. In the first two sections, we review the 
purpose and desired properties of the subsystem. In the 
next two sections, we discuss how the basic program was 
selected from existing database load tools. Finally, in the 
last two sections, we discuss the upgrading of the selected 
program and future enhancements which will further aid the 
performance evaluation team. 

A. THE PURPOSE

The database load subsystem is a software tool used to 
designate an input source file and to create a database from 
that source file. It also allows several related files to 
be consolidated into one database if desired. The first 
phase in the database load subsystem is to define the input 
files and the database. The second phase consists of 
constructing various directory management tables. Lastly, 
the records are distributed across the backends. 

B. DESIRED PROPERTIES 

The database load subsystem must be designed so that the 
performance evaluation may utilize various cluster formation 
variables and storage variables with minimum effort. The 
performance may be expected to depend upon whether the 
number of descriptors (attributes) is large or small. The 
ease of specifying the descriptors (attributes) must be 
guaranteed. The variation of cluster size may affect 
performance. The cluster size is a function of the number 
of descriptors, the size of the input files, and the values 
used in the attribute fields. These three parameters should
be entered independently. The data placement strategy,
i.e., how records are distributed across the backends, also
affects performance. The ability to change the placement
strategy will provide a means of confirming simulation
studies.

C. EXISTING PROGRAMS

Two database load subsystems were reviewed. In this
section the merits of both of the existing programs are 
discussed. The original database load subsystem is covered 
first, then a shortened version of the database load 
subsystem is evaluated. 

1. The Original Database Load Subsystem

The original database load subsystem was first 
designed at the beginning of the implementation stage of 
MDBS. The process is viewed as four logical phases. The 
first phase is the database definition phase, in which the 
user specifies various characteristics of existing source 
files and the characteristics of the database to be created. 
The second phase is the record preparation phase, in which 
the data is read from the input files and prepared for 
loading. The third phase is the record clustering phase, in 
which the prepared records are sorted into clusters. The 
last phase is the record and table distribution phase. This 
phase distributes the records and the directory management 
tables to the backends. 

2. The Shortened Database Load Subsystem

As stated in Chapter III, the shortened database load
subsystem is much simpler than the original database load
subsystem. This implementation can be viewed as two phases.
The first phase is the directory table construction phase,
in which specified database parameters are read from
existing files and the directory tables are constructed.
The second phase is the record distribution phase. In this
phase the records are distributed to the backends by using
insert requests. Thus this subsystem uses currently
existing directory management functions to load the database
records.

D. THE SELECTION OF THE DATABASE LOAD SUBSYSTEM

Several disadvantages to the original database load 
program exist. Since it was created at the inception of 
MDBS design, it possessed many system incompatibilities with 
the current version of MDBS. Once again the large size of 
the program posed a significant maintenance problem with 
respect to the conversion of the system to the new compiler. 
For these reasons this program was not selected. 

The shortened version of the database load subsystem was 
chosen as the basis for the database load tool. This was 
due to the fact that it used existing directory management 
code and that it was much simpler to understand and thus 
maintain. 

E. THE UPGRADING PROCESS

In this section, we discuss the upgrading of the
shortened version of the database load subsystem. First, a
discussion of the communication among processes is
presented. Then the changes to the database load subsystem
are discussed.
1. Message Passing

In order to load the current version of MDBS, it was
necessary to change the database load subsystem so that it
could communicate with the backend process of directory
management. The database load subsystem is implemented as a
separate process in the controller. A brief discussion of 
message passing in MDBS is presented below. 

a. Message Passing Within a Backend 

The backends are implemented on PDP-11/44s running
under the RSX-11M operating system. The inter-process
communication facility is shared access to physical
memory. Suppose process X wants to send a message to 
process Y. X will copy the message into the shared area. 
Then X tells the operating system to send the address of the 
message to process Y. When Y is ready to receive a message, 
it gets the address of the message from the operating 
system's queue of such addresses. Process Y then copies the 
message into its own memory space. 
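
The exchange between processes X and Y can be modeled in a few
lines. This is a toy model and not RSX-11M code: the classes
SharedArea and Kernel are hypothetical stand-ins for the shared
memory region and the operating system's per-process address
queues, and freeing the slot once the receiver has copied the
message is an assumption.

```python
from collections import deque


class SharedArea:
    """Stands in for the physical memory shared by the processes."""

    def __init__(self):
        self.slots = {}
        self.next_address = 0

    def write(self, message):
        """Sender copies the message in and learns its address."""
        address = self.next_address
        self.slots[address] = message
        self.next_address += 1
        return address

    def read(self, address):
        """Receiver copies the message out; the slot is then freed."""
        return self.slots.pop(address)


class Kernel:
    """Stands in for the operating system's queues of addresses."""

    def __init__(self):
        self.queues = {}   # process name -> queue of message addresses

    def send(self, receiver, address):
        self.queues.setdefault(receiver, deque()).append(address)

    def receive(self, receiver):
        queue = self.queues.get(receiver)
        return queue.popleft() if queue else None
```

Only the address travels through the operating system; the
message body is written once into the shared area and read once
by the receiver.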

b. Message Passing Within the Controller 

The MDBS controller is a VAX-11/780 using the
VMS operating system. The inter-process communication
facility is the mailbox. The mailbox is a software
input/output device. If process X wishes to send process Y a
message, process X first issues a send command to process
Y's mailbox. When process Y issues the read command on its
mailbox, it will be given the message sent by process X. The
mailbox can queue several messages.
c. Message Passing Between Computers

Communication between computers in MDBS is
achieved by using a time-division-multiplexed bus called the
parallel communication link (PCL). Two interface processes
to the PCL are used in each computer. The first process,
called put_PCL, puts the message to be sent to the other
computers on the PCL. The second process, called get_PCL,
receives the message from the bus and then passes the
message to the appropriate process. PCLs are presently used
to simulate the broadcast bus and will be replaced
physically by a broadcasting bus later.

2. Directory Tables

Several directory tables exist in order to process 
requests. In this section the logical descriptions of such 
tables are discussed. This will allow some insight into 
what kind of messages must be sent during the loading of the 
database. 

The Attribute Table (AT) contains a list of the
directory attributes and a pointer to the descriptors
defined on these attributes. The AT is located at each
backend. The Descriptor-to-Descriptor-Id Table (DDIT)
contains the descriptors and their corresponding descriptor
ids. Each section of the DDIT is associated with a
directory attribute and contains the descriptors defined on
that attribute. The DDIT is located at each backend. Since
type-C sub-descriptors are created dynamically as new
records are inserted, the type-C attributes must be recorded
in a table called the Type-C-Descriptor-Table (TCDT). The
TCDT is located in the controller. When an insert request
contains a record with a type-C attribute and the value of
the attribute does not appear in a type-C descriptor, a new
type-C descriptor will be created by the Insert Information
Generation process. This process will then record the
descriptor in the TCDT. Thus, during loading, all directory
attributes and their corresponding descriptors are sent to
the backends' Directory Management processes. All type-C
attributes are also sent to the Insert Information
Generation process in the controller.
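
The dynamic creation of type-C descriptors can be sketched as
follows. The layout is assumed for illustration only: the TCDT
is modeled as a mapping from each type-C attribute to the
values seen so far, descriptor ids are plain integers, and the
class and method names are hypothetical.

```python
class TypeCDescriptorTable:
    """Controller-side TCDT: type-C attribute -> {value: descriptor id}."""

    def __init__(self):
        self.descriptors = {}
        self.next_id = 1

    def load_attribute(self, attribute):
        """Register a type-C attribute (done while loading the database)."""
        self.descriptors.setdefault(attribute, {})

    def descriptor_for(self, attribute, value):
        """Return the descriptor id for a value, creating it if new.

        Returns None when the attribute is not a type-C attribute.
        """
        values = self.descriptors.get(attribute)
        if values is None:
            return None
        if value not in values:
            # First insert with this value: create and record a descriptor.
            values[value] = self.next_id
            self.next_id += 1
        return values[value]
```

The first insert carrying DEPT=TOYS creates a descriptor;
later inserts with the same value reuse it.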

3. Specific Upgrades

The database load subsystem program was changed by
allowing it to communicate with the backends in order to
load the database to the backends. In order to distribute
the directory management tables to all backends, the
database load subsystem must be given its own mailbox and
access to the directory management physical areas located in
the backends. All of the functions which create the
directory management tables were moved to the backends and
appropriately placed in the directory management processes.
Data necessary to construct these tables was passed to the
backends by using messages containing codes which indicate
the type of action to be taken. Because the backends can
construct the tables in parallel, this did not significantly
burden the database load process. In order to support the
message passing ability, send and receive routines specific
to the database load process were written. Figure 4.1
illustrates the inter-process communication involved with
the directory table construction phase.

In order to load the records into the database, 
communication between the request preparation process 
(located in the controller) and the database load subsystem 
was established. This allowed the database load subsystem 
to send the insert requests directly to request preparation. 
Thus the database load subsystem was given access to the 
request preparation mailbox. It was also necessary to send 
the Insert Information Generation process all of the type-C 
attributes for insertion into the TCDT. Figure 4.2 shows
the inter-process communication of the record distribution 
phase. 

The following is a summary of the types of messages 
which were added to the database load subsystem: 



Message type:  (1) Create AT
Source:        Database Load (DBL)
Destination:   Directory Management
Explanation:   This message creates an AT for
               the given database name.

Message type:  (2) Add Attribute to AT
Source:        Database Load (DBL)
Destination:   Directory Management
Explanation:   This message adds an attribute
               to the AT for the given database.

Message type:  (3) Add Descriptor to DDIT
Source:        Database Load (DBL)
Destination:   Directory Management
Explanation:   This message adds a descriptor
               to the DDIT for the given database.

Message type:  (4) Add the end of descriptor flag
Source:        Database Load (DBL)
Destination:   Directory Management
Explanation:   This message adds the flag to signal
               the end of the descriptor list.

Message type:  (5) Load type-C
Source:        Database Load (DBL)
Destination:   Insert Information Generation
Explanation:   This message passes the type-C attribute
               to IIG for entry into the TCDT.

Message type:  (6) Insert record
Source:        Database Load (DBL)
Destination:   Request Preparation
Explanation:   This message sends the record to be
               loaded to RP.

Message type:  (7) Responses
Source:        Directory Management and
               Insert Information Generation
Destination:   Database Load
Explanation:   This group of messages informs DBL of
               action that is actually carried out as
               requested by the above messages from
               DBL. They also include error messages.


Thus for each of the messages (1) through (6), a type (7)
message is sent to the database load subsystem. This
concludes the upgrading of the database load subsystem.
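
The message types summarized above could be encoded as in the
following sketch. The numeric codes simply reuse the numbering
(1) through (7) of the summary; the actual codes used by the
implementation are not given here. The order in which messages
are sent, and the choice to send one end-of-descriptor flag per
attribute, are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class MessageType(IntEnum):
    CREATE_AT = 1
    ADD_ATTRIBUTE = 2
    ADD_DESCRIPTOR = 3
    END_OF_DESCRIPTORS = 4
    LOAD_TYPE_C = 5
    INSERT_RECORD = 6
    RESPONSE = 7


# Receiving process for each message type, as listed in the summary.
DESTINATION = {
    MessageType.CREATE_AT: "Directory Management",
    MessageType.ADD_ATTRIBUTE: "Directory Management",
    MessageType.ADD_DESCRIPTOR: "Directory Management",
    MessageType.END_OF_DESCRIPTORS: "Directory Management",
    MessageType.LOAD_TYPE_C: "Insert Information Generation",
    MessageType.INSERT_RECORD: "Request Preparation",
    MessageType.RESPONSE: "Database Load",
}


@dataclass
class Message:
    type: MessageType
    payload: object = None


def table_construction_messages(database, attributes):
    """Yield the messages DBL might send to build the directory tables.

    attributes: name -> (list of descriptors, is_type_c).
    """
    yield Message(MessageType.CREATE_AT, database)
    for name, (descriptors, is_type_c) in attributes.items():
        yield Message(MessageType.ADD_ATTRIBUTE, name)
        for descriptor in descriptors:
            yield Message(MessageType.ADD_DESCRIPTOR, (name, descriptor))
        yield Message(MessageType.END_OF_DESCRIPTORS, name)
        if is_type_c:
            yield Message(MessageType.LOAD_TYPE_C, name)
```

In a full exchange, DBL would wait for a RESPONSE message after
each message it sends before continuing.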



F. FUTURE IMPROVEMENTS

The database load subsystem contains all of the desired
properties discussed above with the exception of the ability 
to change the data placement strategy. Due to the manner in 
which the database is loaded, this would require a change in 
the directory management process. Further research is 
required to investigate the ramifications of changing the 
directory management process. This feature should be 
delayed until the system conversion to the new compiler is 
completed.

V. THE TEST REQUEST GENERATION PACKAGE



In this chapter, we discuss the test request generation 
package development. In the first two sections, we review 
the purpose and desired properties of the package. In the 
next two sections, we discuss how the basic program was 
selected from existing request generation tools. Finally, 
in the last two sections, we discuss the upgrading of the 
selected program and future enhancements which will further 
aid the performance evaluation team. 

A. THE PURPOSE 

The purpose of the test request generation package is to 
provide an easy means of creating a list of test requests 
which will be executed in order to test MDBS. The package 
also aids the evaluation team in executing the list of 
requests. The list of requests is saved in a file for
future use, in order to avoid regenerating the list of
requests.



B. DESIRED PROPERTIES 



Recall that the test request generation package permits
the request formation variables to be altered by the
evaluation team. This allows the following to be varied: the
percentage of each type of request (retrieve, update,
insert, or delete), the percentage of aggregate operators
(ave, max, min, sum, and count) in retrieve requests, the
complexity of the request query, the order of the predicates
appearing in the request, and the number of attributes to be
projected in the retrieve request.






The request generation package must also allow the
following: varying the length of the transaction, tagging
requests with user identification, retrieving a record
defined over the null descriptor, and executing a retrieve
request in which the entire cluster is stored at one backend
so that its performance can be compared with that of a
request which retrieves records from a cluster stored across
all backends.

C. EXISTING PROGRAMS

Two existing programs were reviewed in order to select
the one which best fits the desired properties and is
compatible with the current version of MDBS. Both programs
implement the test request generation package in the 
controller. The next section discusses version A of the 
test request generation package. Version A was originally 
designed at the commencement of the implementation of MDBS. 
Version B was a later version. 

1. Version A

Version A may be described as a package which aids 
the user in developing a list of requests. The user is 
guided through the construction of one request at a time. 
The program ensures that the syntax is correct. The intent 
of this method is to generate a small number of requests 
which are thoughtfully devised in order to test specific 
features of MDBS. This program also assumes that one user
will execute only one request at a time. The user is
allowed the following options when using this test request
generation package: generating a list of requests for later
use, retrieving a list of requests to be executed in any
order, modifying an existing list, or executing a list of
requests.

2. Version B

Version B is a follow-on package to Version A. It
therefore possesses all of the features contained in Version
A. In addition, Version B supports the concept of
transactions. Recall that a transaction is a group of one or
more requests. Thus the requirement of executing only one
request at a time is removed.

D. THE SELECTION OF THE TEST REQUEST GENERATION PACKAGE

Because Version B contains all the features of Version 
A, Version B was selected as the test request generation 
package. Because this version arrived at the current imple- 
mentation site of MDBS rather late in the review of perform- 
ance evaluation tools, many of the desired features must be 
left for future development. This does not detract from the 
usefulness of the test request generation package as it 
stands. 

E. THE UPGRADING PROCESS 

The majority of the upgrading accomplished on the test
request generation package consisted of removing the syntax
discrepancies due to compiler differences. A reorganization
of the file location of MDBS also resulted in many changes
to the programs.

F. FUTURE IMPROVEMENTS

Several enhancements to the request generation package
may be desirable. Three major enhancements include the
following: program generation of requests, simulation of
multiple concurrent users, and development of a storage
information package to aid in request selection.

1. Program Generation of Requests

In order to test MDBS, the test request generation
package could be modified to contain a routine which
generates random requests. The input to such a routine would
include parameters such as the percentage of each type of
request to be generated and the query complexity. Query
complexity is varied by changing the number of predicates in
the requests. This ability would allow the evaluation team
to easily determine which type of request is most efficient
under MDBS.
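
A minimal sketch of how such a routine might choose the type of each generated request is given below. It assumes the percentages are supplied as integers that sum to 100; the names request_mix, pick_request_type, and random_request_type are hypothetical and do not appear in the existing package.

```c
#include <stdlib.h>

enum request_type { REQ_RETRIEVE, REQ_UPDATE, REQ_INSERT, REQ_DELETE };

/* Hypothetical parameter block: percentage of each type of request. */
struct request_mix {
    int retrieve_pct;
    int update_pct;
    int insert_pct;
    int delete_pct;
};

/* A mix is usable only if no entry is negative and the total is 100. */
int request_mix_is_valid(const struct request_mix *mix)
{
    if (mix->retrieve_pct < 0 || mix->update_pct < 0 ||
        mix->insert_pct < 0 || mix->delete_pct < 0)
        return 0;
    return mix->retrieve_pct + mix->update_pct +
           mix->insert_pct + mix->delete_pct == 100;
}

/* Map a roll in the range 0..99 onto a request type, so that each
   type is chosen with the percentage given in the mix. */
enum request_type pick_request_type(const struct request_mix *mix, int roll)
{
    if (roll < mix->retrieve_pct)
        return REQ_RETRIEVE;
    roll -= mix->retrieve_pct;
    if (roll < mix->update_pct)
        return REQ_UPDATE;
    roll -= mix->update_pct;
    if (roll < mix->insert_pct)
        return REQ_INSERT;
    return REQ_DELETE;
}

/* Draw the roll from the C library generator. */
enum request_type random_request_type(const struct request_mix *mix)
{
    return pick_request_type(mix, rand() % 100);
}
```

Separating the roll from the mapping keeps the selection testable without depending on the random number generator; the number of predicates per request could be drawn in the same manner.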

2. Simulation of Multiple Concurrent Users

In order to evaluate the effect of concurrency
control, MDBS must be tested while several users are using
the system. By providing a way to link a user to the
requests which are generated, the test request generation
package would simulate multiple users. This would avoid
processing several separate files of requests. This would
also result in repeatable experiments, in that the
conditions resulting from executing the concurrent user
requests could be duplicated.
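
One possible way to obtain such repeatability is sketched below: each request carries a user identifier, and the requests of all users are interleaved in a fixed round-robin order. The structure tagged_request and the function interleave_users are hypothetical names used only for this illustration; the actual scheduling policy is left open.

```c
#include <stddef.h>

/* A request tagged with the (simulated) user that issues it. */
struct tagged_request {
    int user_id;   /* which simulated user issues the request */
    int seq_no;    /* position of the request within that user's list */
};

/* Fill 'out' with a round-robin schedule in which each of n_users
   users issues per_user requests. At most 'cap' entries are written.
   The order depends only on the arguments, so a run can be repeated. */
size_t interleave_users(int n_users, int per_user,
                        struct tagged_request *out, size_t cap)
{
    size_t n = 0;
    int seq, user;

    for (seq = 0; seq < per_user; seq++) {
        for (user = 0; user < n_users; user++) {
            if (n == cap)
                return n;
            out[n].user_id = user;
            out[n].seq_no = seq;
            n++;
        }
    }
    return n;
}
```

Because the schedule is a pure function of its arguments, the same list of tagged requests can be regenerated for every experiment instead of being kept in several separate files.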

3. The Storage Information Package

The storage information package would allow the
experimenter to ask specific questions about the database
storage information so that intelligent queries can be
derived. The questions an experimenter might ask would
include: What descriptors are associated with a certain
attribute? What descriptor ids define a certain cluster
number? Where is cluster one stored?

This package could be implemented by sending
messages to the backends. Each message would be associated
with a routine which walks through the directory management
tables, finds the appropriate information, and sends it
back to the controller. By evaluating the responses to the
messages, more meaningful requests can be constructed in
order to evaluate specific features of MDBS.
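
The first of the questions above can be sketched as a simple table walk. The layout of descriptor_entry is an assumption made for this example and does not reflect the actual directory management tables; it only shows the kind of routine a storage information message might invoke at a backend.

```c
#include <string.h>

/* Hypothetical, simplified view of one descriptor table entry. */
struct descriptor_entry {
    const char *attribute;  /* directory attribute name */
    int         desc_id;    /* descriptor id */
    const char *lower;      /* lower bound of the descriptor */
    const char *upper;      /* upper bound of the descriptor */
};

/* Collect the ids of all descriptors defined on 'attribute'.
   At most max_ids ids are stored in ids_out; the number stored
   is returned. */
int descriptors_for_attribute(const struct descriptor_entry *table,
                              int n_entries, const char *attribute,
                              int *ids_out, int max_ids)
{
    int i, found = 0;

    for (i = 0; i < n_entries && found < max_ids; i++) {
        if (strcmp(table[i].attribute, attribute) == 0)
            ids_out[found++] = table[i].desc_id;
    }
    return found;
}
```

The other two questions (descriptor ids for a cluster, and the location of a cluster) would follow the same pattern over the corresponding tables.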



VI. ANALYSIS OF PERFORMANCE EVALUATION TOOLS

In Chapter I, we discussed the study phase of creating
the tools. In Chapter II, we discussed the design phase.
The development phase was outlined in Chapters III, IV, and
V. In this chapter, we discuss the operational phase. This
taxonomy of phases is outlined in detail in [Ref. 8]. More
specifically, in this chapter, we discuss the performance
evaluation tools with respect to several software
engineering principles.

A. BASIS OF ANALYSIS

In this section, we discuss the standards by which the
evaluation tools are to be analyzed. The two major
categories of the analysis are the ability to meet the
objectives stated in the design phase and the ability to
meet software goals. The standards are described in detail
in [Ref. 9] and [Ref. 10].

The ability to meet objectives means that the tool poss- 
esses the capabilities outlined in the design phase. These 
capabilities were discussed in detail in Chapter II. 

The performance evaluation tools will also be evaluated
by their ability to meet five software goals. The first
goal is that of modifiability. Modifiability includes the
properties of extensibility, consistency, maintainability,
and modularization. The second goal is that of reliability.
Reliability includes the properties of possessing no blatant
errors and of possessing error recoverability. The third
goal is simplicity. This includes ease of use and singleness
of purpose. Efficiency is the fourth goal. A tool meets
this goal if it contains no gross inefficiency. The last
software goal is that of understandability.
Understandability means that the tool utilizes abstractions,
modularity, and information hiding, and is supported with
reasonable documentation.

B. ANALYSIS OF THE FILE GENERATION PACKAGE

The objectives of the file generation package were 
discussed in Chapter II. The objective that was not met by 
this tool is the ability to indicate whether values of the 
attributes are taken from random functions or predetermined 
sets of values. The random functions must be added at a 
future date. 

The file generation package meets all goals with the
exception of efficiency. Modifiability is achieved through
the extensive use of modularization with respect to grouping
like operations together. Reliability has been observed in
that no errors have been found since the tool entered the
operational phase. Simplicity is demonstrated by using
menu-driven operations in the file generation package.
Lastly, understandability is achieved by religious use of
abstraction of data and operation. The gross inefficiency in
the package results from the use of a large array which is
used to store the unique records which are generated. When a
large number of records are to be inserted at one time, the
time to compare each new record against all previously
generated records is great. This concludes the evaluation of
the test file generation package.
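
One way to reduce the comparison cost described above would be to keep the generated records in a hash table, so that a new record is compared only against the few records that share its hash slot. The sketch below is not part of the existing package; the names record_set and record_set_add, the table size, the record length, and the choice of the FNV-1a hash are all assumptions made for illustration.

```c
#include <string.h>

#define SLOT_COUNT  1024   /* hypothetical; keep well above the record count */
#define MAX_REC_LEN 80     /* hypothetical maximum record length */

/* Open-addressing hash set of record strings. */
struct record_set {
    char          text[SLOT_COUNT][MAX_REC_LEN + 1];
    unsigned char used[SLOT_COUNT];
    size_t        count;
};

/* FNV-1a hash of a NUL-terminated string. */
static unsigned long hash_record(const char *s)
{
    unsigned long h = 2166136261UL;

    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619UL;
    }
    return h;
}

void record_set_init(struct record_set *set)
{
    memset(set, 0, sizeof *set);
}

/* Returns 1 if the record was new and has been stored, 0 if it
   duplicates an earlier record, and -1 if it cannot be stored
   (record too long or table full). */
int record_set_add(struct record_set *set, const char *rec)
{
    size_t start, i;

    if (strlen(rec) > MAX_REC_LEN)
        return -1;
    start = (size_t)(hash_record(rec) % SLOT_COUNT);
    for (i = 0; i < SLOT_COUNT; i++) {
        size_t slot = (start + i) % SLOT_COUNT;

        if (!set->used[slot]) {
            if (set->count >= SLOT_COUNT - 1)
                return -1;          /* always keep one slot free */
            strcpy(set->text[slot], rec);
            set->used[slot] = 1;
            set->count++;
            return 1;
        }
        if (strcmp(set->text[slot], rec) == 0)
            return 0;               /* duplicate found */
    }
    return -1;
}
```

With a table that is kept sparsely filled, each duplication check touches only a handful of slots on average, instead of every record generated so far.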

C. ANALYSIS OF THE DATABASE LOAD SUBSYSTEM 

The objectives of the database load subsystem were 
discussed in Chapter II. The objective that was not met by 
this tool is the ability to vary the data placement 
strategy. This ability must be added at a future date.

The database load subsystem meets all goals with the
exception of efficiency. Modifiability is achieved through
the extensive use of modularization with respect to grouping
like operations together. For instance, all of the routines
to pass messages are grouped in send and receive modules
which are kept in separate files. Reliability has been
observed in that no errors have been found since the tool
entered the operational phase. Simplicity is demonstrated by
using menu-driven operations. Lastly, understandability is
achieved by religious use of abstraction both in the data
and the operations. The gross inefficiency in the package
results from the use of a large number of insert requests
which are sent one at a time to the backends. This
inefficiency could be reduced by grouping several insert
requests into a transaction and then sending the transaction
to the backends. It is also possible to save all type-C
descriptors in the database load subsystem and send all of
them to Insert Information Generation at the end of the
directory table loading. This concludes the evaluation of
the database load subsystem.
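
The grouping of insert requests suggested above can be sketched as a small buffer that is flushed as one transaction whenever it fills. The names insert_batch, batch_add, and batch_flush, the batch size, and the send hook are hypothetical; in MDBS the flush would hand the grouped requests to the existing message-passing routines.

```c
#include <stddef.h>

#define BATCH_SIZE 8   /* hypothetical number of inserts per transaction */

struct insert_batch {
    const char *requests[BATCH_SIZE];  /* pending insert requests */
    int count;                         /* number of pending requests */
    int transactions_sent;             /* number of flushes so far */
    /* Hypothetical transport hook; may be NULL when only counting. */
    void (*send)(const char **requests, int n);
};

void batch_init(struct insert_batch *b,
                void (*send)(const char **requests, int n))
{
    b->count = 0;
    b->transactions_sent = 0;
    b->send = send;
}

/* Send all pending requests as one transaction. */
void batch_flush(struct insert_batch *b)
{
    if (b->count == 0)
        return;
    if (b->send != NULL)
        b->send(b->requests, b->count);
    b->transactions_sent++;
    b->count = 0;
}

/* Queue one insert request; flush when the batch is full. */
void batch_add(struct insert_batch *b, const char *request)
{
    b->requests[b->count++] = request;
    if (b->count == BATCH_SIZE)
        batch_flush(b);
}
```

After the last record has been read, one final call to batch_flush sends any partially filled batch. With a batch size of 8, loading 20 records would then require 3 transactions rather than 20 separate insert messages.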

D. ANALYSIS OF THE REQUEST GENERATION PACKAGE

The objectives of the test request generation package
were discussed in Chapter II. The objectives that were not
met by this tool are the following enhancements: program
generation of requests, simulation of multiple concurrent
users, and development of a storage information package to
aid in request selection. These abilities must be added at
a future date.

The test request generation package meets all goals with
the exception of possessing consistency. Modifiability is
achieved through the extensive use of modularization with
respect to grouping like operations together. For instance,
all of the routines which are involved with creating a
request are divided into modules, each of which handles a
distinct aspect of the request. This goal is seen
throughout MDBS. Reliability has been observed in that no
errors have been found since the tool entered the
operational phase. Simplicity is demonstrated by using
menu-driven operations. Lastly, understandability is
achieved by religious use of abstraction both in the data
and the operations. Consistency may be achieved by altering
the test request generation package to use information
stored in the files generated by both the test file
generation package and the database load subsystem. These
files could be used for the extraction of necessary
information instead of prompting the user to re-enter data
supplied earlier. This lack of consistency is the weakest
link in establishing a tight performance evaluation
environment. This is further discussed in the next section.
This concludes the evaluation of the test request generation
package.

E. FUTURE DEVELOPMENTS

The most important future development should be the
integration of the performance evaluation tools into a
performance evaluation environment. In this way, the
property of consistency of the tools will be attained. That
is, the output of one tool can be used as input to the next
tool in the logical sequence of the performance evaluation
effort. This has been achieved in the interface between the
test file generation package and the database load
subsystem. The next step would be to develop consistency in
the interface between the database load subsystem and the
test request generation package.

This concludes the discussion on the analysis of the 
performance evaluation tools.

VII. CONCLUSIONS



In this thesis, we have discussed the development of the
necessary tools for the performance evaluation of a
multi-backend database system, known as MDBS. The basic
motivation of the multi-backend database system (MDBS) was
to develop an architecture which spreads the work of the
database system among multiple backends. It was a major aim
of this system to allow capacity growth by the use of
additional disk drives and performance improvement by the
use of additional backends. However, to verify the design
and implementation, it is necessary to test the capability
of MDBS in capacity growth and performance gain.

Three tools for the performance and capacity tests were
investigated. The first tool was the file generation
package which creates test files for any artificial
database. The second tool was the database load subsystem
which loads the artificial database into MDBS. The third
tool was the request generation package. This package
creates test requests to query MDBS.

The following methodology was used to create an effec- 
tive tool. First, the properties of an ideal tool were 
described. Then available existing programs were reviewed 
and evaluated to determine which program best meets the 
desired features. The programs were upgraded to ensure that 
they were compatible with the current implementation, and 
met the desired features. Lastly, the tools were analyzed 
with respect to meeting the desired properties and satis- 
fying several software engineering goals. 

The main goal was to develop the necessary tools to
generate tests for measuring the extensibility of MDBS, i.e.,
how does MDBS perform as more backends are added?

APPENDIX A

DESIGN SPECIFICATION OF THE TEST FILE GENERATION PACKAGE



This appendix contains the design of the test file
generation package, which is a subset of the shortened
database load subsystem. The design consists of C language
code for the function headings and their corresponding
declarations. The bodies of the functions are given in
English text.



/* TEST FILE   */
/* GENERATION  */
/* PACKAGE     */
/* DESIGN      */



main_program ()
begin
    generate ();    /* generate the records */
end



generate ()
/* This routine                                         */
/*  - generates a record template                       */
/*  - generates/modifies sets of values for attributes  */
/*  - generates descriptors                             */
/*  - generates records using the sets                  */
begin
    while ( TRUE )
    begin
        /* Ask the user for type of operation to be performed */
        /* Take appropriate action */

        /* generate record template */
        gen_tmpl ();

        /* generate descriptors */
        gen_desc ();

        /* generate/modify sets */
        gm_set ();

        /* generate the records */
        gen_rec ();

        /* load the records */
        db_load ();

        /* do nothing */

    end /* end while */
end



gen_tmpl ()
/* This routine generates a record template */
begin
    char tfn (MFNLength + 1);     /* template-file name */
    char c, dbid (DBIDLNTH + 1), hold (MAX_FIELDS + 1), temtyp;
    int  i, k, no_attr;
    FILE *fopen (), *tmpl_fp;

    /* Get name of template file */
    /* Open template file */
    /* Get database ID from the template file */
    /* Write database ID to template file */
    /* Get number of attributes */
    /* Write number of attributes to template file */
    /* Get attributes and value types */
    for ( each attribute )
    begin
        /* Enter the attribute name */
        /* Enter the value type: (s=string, i=integer) */
    end /* end for */
    /* Close template file */
end /* end gen_tmpl */



gen_desc ()
begin
    char tfn (MFNLength + 1);     /* template-file name */
    char dfn (MFNLength + 1);     /* descriptor-file name */
    char attr_name (ANLength),
         answer (5), desc_type, val_type, c, hold (3);
    int  i, j, no_attr;
    FILE *fopen (), *tmpl_fp, *desc_fp;

    /* Get the template-file name */
    /* Open template file */
    /* Get the name of the file for storing descriptors */
    /* Open descriptor file */
    /* Read thru Database ID to get */
    /* to number of attributes */
    /* Get number of attributes */
    /* For each attribute get its descriptors (if applicable) */
    for ( each attribute )
    begin
        /* Read attribute */
        /* Get attribute name */
        /* Get value type for the attribute */
        /* Ask if attribute is to be a directory attribute */
        if ( answer == yes )
        begin
            /* Write attribute name to descriptor file */
            /* Get descriptor type for attribute */
            /* Write descriptor type to descriptor file */
            if ( desc_type == 'C' || desc_type == 'c' )
                gen_C (val_type, desc_fp);
            else
                gen_notC (val_type, desc_fp);
            /* Write end of data symbol to descriptor file */
        end
    end /* end for */
    /* Write end of file symbol to descriptor file */
    /* Close files */
end /* gen_desc */



gen_C (val_type, desc_fp)
char val_type;
FILE *desc_fp;
begin
    char lowerb (AVLength), upperb (AVLength), hold (3);
    int  fault, k;

    /* Get upper bounds for type 'C' descriptors */
    while ( TRUE )
    begin
        /* Get upper bound */
        if ( end of data )
            return;
        else
        begin
            /* Verify upper bound entry against */
            /* attribute value type */
            /* Write NOBOUND and upper bound */
            /* to descriptor file */
        end
    end
end /* end gen_C */



gen_notC (val_type, desc_fp)
char val_type;
FILE *desc_fp;
begin
    char lowerb (AVLength), upperb (AVLength), hold (3);
    int  fault, k;

    /* Get lower and upper bounds for descriptor */
    while ( TRUE )
    begin
        /* Get lower bound */
        if ( end of data )
            return;
        else
        begin
            /* Verify lower bound entry against */
            /* attribute value type */
            /* Write lower bound to descriptor file */
        end
        /* Get upper bound */
        /* Verify upper bound entry against */
        /* attribute value type */
        /* Write upper bound to descriptor file */
    end /* end while */
end /* end gen_notC */



gm_set ()
/* This routine generates/modifies sets of values. */
begin
    char tfn (MFNLength + 1);     /* template-file name */
    char attr_name (ANLength + 1), answer, c, val_type,
         hold (AVLength + 1);
    char tmptyp;
    int  no_attr, k, i;
    FILE *fopen (), *tmpl_fp;

    /* Get the template-file name */
    /* Open template file */
    /* Get number of attributes */
    for ( each attribute )
    begin
        /* Get attribute name */
        /* Get value type */
        /* Choose the action to be taken on attribute
             n) - generate a new set for it
             m) - modify an existing set for it
             s) - do nothing with it */
        switch ( answer )
        begin
            case 'n':
                /* generate new set */
                gen_set (val_type);
                break;
            case 'm':
                mod_set (val_type);
                break;
            case 's':
                break;
        end /* end switch */
    end /* end for */
    /* Close template file */
end /* end gm_set */



gen_set (val_type)
/* This routine generates a set */
/* of values for an attribute.  */
char val_type;
begin
    struct definition
    begin
        char elem (SetSize) (AVLength + 1);
                /* array for holding set elements */
        int  no_elem;
                /* number of elements in set */
    end set;
    char filnam (MFNLength + 1),
         entry (AVLength + 1), answer (5);
    int  k, fault, limit;
    FILE *fopen (), *tmpl_fp;

    /* Get name of set file */
    /* Open set file */
    /* Accept elements for the set */
    while ( set is not full )
    begin
        /* Enter a value for the set */
        /* Verify set entry against attribute type */
        /* Check for set element duplication */
    end

    if ( set is full )
        /* Tell user */
    /* Write set elements to set file */
    /* Write end of file symbol to set file */
    /* Close set file */
    /* Ask if user wants to modify it */
    if ( answer == yes )
        mod_set (val_type);
end /* end gen_set */



mod_set (val_type)
/* This routine modifies a set */
/* of values for an attribute. */
char val_type;
begin
    char ofn (MFNLength + 1),     /* old-file name */
         nfn (MFNLength + 1),     /* new-file name */
         filnam (MFNLength + 1);
    char c, answer (5), entry (AVLength + 1), index (5);
    int  i, k, fault, j;
    struct
    begin
        int  no_elem;                        /* number of elements in the set */
        char rem_flag (SetSize);             /* element removed flag */
        char elem (SetSize) (AVLength + 1);  /* elements */
    end set;
    FILE *fopen (), *set_fp;

    /* Get the name of the set to be modified */
    /* Open file */
    /* Read given file into array for manipulation */
    while ( TRUE )
    begin
        /* Ask what do you want to perform next?
             (p) - print the set elements and their indices
             (a) - add some elements to the set
             (r) - remove some elements from the set
             (n) - nothing; done */
        if ( answer == 'p' )
        begin
            /* Print elements of file */
        end /* end ( answer == 'p' ) */
        else if ( answer == 'a' )
        begin
            /* Add some elements */
            /* Check for set element duplication */
            /* Verify entry against */
            /* attribute value type */
            /* Add element to array if correct */
        end /* end ( answer == 'a' ) */
        else if ( answer == 'r' )
        begin
            /* Remove some elements */
            /* Mark set elements for removal */
            /* Re-order array to reflect deletions */
        end /* end ( answer == 'r' ) */
        else
            /* Nothing; done */
            break;    /* exit while */
    end /* end while ( TRUE ) */
    /* Ask if user wants to store the modified set back
       into the original file */
    /* Write array back into file designated */
    /* Write end of file symbol to set file */
    /* Close set file */
end /* end mod_set */



gen_rec ()
/* This routine generates records using sets. */
begin
    char c;
    char hold (AVLength + 1);
    char attr_name (AVLength + 1);
    char dbid (DBIDLNTH + 1),
         gr_records (MAX_RECORDS) (MRLength + 1);
    char tfn (MFNLength + 1),     /* template-file name */
         rfn (MFNLength + 1),     /* record-file name */
         vfn (MFNLength + 1);     /* temporary file name */
    struct
    begin
        int  no_elem (MAX_FIELDS);
        char elems (MAX_FIELDS) (SetSize) (AVLength + 1);
    end values;
    FILE *fopen (), *tmpl_fp, *rec_fp, *stor_fp;
    int  no_attr, k, i, j, count, gr_no_rec, max,
         rec_cnt, prod, index, old;

    /* Get the template-file name */
    /* Open template file */
    /* Get file for record storage */
    /* Open record file */
    /* Read database ID */
    /* Write database ID to storage file */
    /* Read number of attributes in a record */
    /* Read elements of files corresponding to */
    /* each attribute into an array */
    for ( each attribute )
    begin
        /* Read the attribute name */
        /* Get the file name for the given attribute */
        /* Open file */
        /* Read elements of set into array */
        /* Close file */
    end /* end for */
    /* Close template file */
    /* Calculate total possible number of unique records */
    /* Get the number of records to be generated */
    /* Determine feasibility of requested number */
    /* Generate records by choosing (at random) */
    /* a member from each of the given sets */
    for ( each record )
    begin
        for ( each attribute )
        begin
            /* Get a value randomly from the set */
        end
        /* Give some feedback to user of generation effort */
        /* Check generated record for possible duplication */
    end
    /* Write generated records to file */
    /* Write end of file symbol to file */
    /* Let user know when completed */
    /* Close file */
end /* end gen_rec */
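
The two steps "Calculate total possible number of unique records" and "Determine feasibility of requested number" in gen_rec can be sketched as follows. This is an illustration under the assumption that a record is unique when its combination of attribute values is unique, so that the total is the product of the set sizes; the function names are hypothetical and do not appear in the package.

```c
#include <limits.h>

/* Product of the set sizes, i.e. the number of distinct records that
   can be formed by taking one value from each set. The result is
   limited to 'cap' so that the multiplication cannot overflow. */
unsigned long max_unique_records(const int *set_sizes, int no_attr,
                                 unsigned long cap)
{
    unsigned long total = 1;
    int i;

    for (i = 0; i < no_attr; i++) {
        unsigned long n;

        if (set_sizes[i] <= 0)
            return 0;               /* an empty set allows no record */
        n = (unsigned long)set_sizes[i];
        if (total > cap / n)
            return cap;             /* product would exceed cap */
        total *= n;
    }
    return total;
}

/* A request is feasible if it does not ask for more unique records
   than the sets can produce. */
int request_is_feasible(const int *set_sizes, int no_attr,
                        unsigned long requested)
{
    return requested <= max_unique_records(set_sizes, no_attr, ULONG_MAX);
}
```

For three attributes whose sets hold 3, 4, and 5 values, at most 60 unique records exist, so a request for 61 records would be rejected before any generation effort is spent.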



int gr_isdigit (c)
/* This routine determines whether a given */
/* character is a digit */
char c;
begin
    if ( c is a digit )
        return (TRUE);
    else
        return (FALSE);
end



gs_rand (num)
/* This routine generates a random number */
int num;
begin
    static long seed;
    static int  temp;

    seed = seed * 24298 + temp * time(0);
    seed = seed mod 199017;
    seed = (69069 * seed + 1);
    temp = (seed >> 8) & 32767;
    if ( num == 0 )
        return (temp);
    else
        return (temp mod num);



end



APPENDIX B

DESIGN SPECIFICATION OF THE DATABASE LOAD SUBSYSTEM



This appendix contains the design of the shortened
database load subsystem. The design consists of C language
code for the function headings and their corresponding
declarations. The bodies of the functions are given in
English text.



/*                 */
/*  Database Load  */
/*     Design      */
/*                 */



struct rtemp_definition template;

db_load ()
/* This routine loads the directory tables and the database */
/* records. */
begin
    /* Initialize counters */

    /* load the directory tables */
    dbl_dir_tbls ();

    /* load the database records */
    dbl_records ();
end



dbl_dir_tbls ()
/* This routine loads the directory tables. */
begin
    char dbid (DBIDLNTH + 1),
         attrname (ANLength + 1),
         tfn (MFNLength + 1),     /* template-file name */
         dfn (MFNLength + 1),     /* descriptor-file name */
         valtype,
str {i f 

         attrstr (DIL_AttrId + 1),
         desctype;
    int  at_id_no, desc_id_no;
    struct desc_definition descriptor;
    int  i, k, c;
    FILE *fopen (), *fptr;



    /* Initialize the database mailbox */
    /* Get the name of the file containing */
    /* the template information */
    /* Read the database id */
    /* Read number of entries in the template, i.e. */
    /* number of attributes in a record             */
    /* Read the attribute names and the value types */
    /* and place the data in the template record    */
    for ( each attribute to be put in template )
    begin
        /* Read an attribute */
        /* Read the corresponding value type */
    end

    /* Create attribute table for the database in backends */
    DBL_S$Create (dbid);

    /* Get the name of the file containing the descriptors */
    /* Read the directory attributes and their */
    /* corresponding descriptors */
    /* Initialize the attribute counter */
    while ( not the end of data )
    begin
        /* Read an attribute */
        /* Read corresponding descriptor type A, B, or C */
        /* Add the attribute name to the attribute table */
        DBL_S$Atm_insert (dbid, attrname, &desctype);
        if ( desctype == 'c' || desctype == 'C' )
            /* Send the attribute to IIG */
            DBL_S$send_typeC (dbid, attrname, at_id_no);
        /* Using the template, find the value */
        /* type for the attribute */
        /* Read the corresponding descriptors */
        /* for the attribute */
        /* Initialize the descriptor id */
        while ( More descriptors )
        begin
            /* Get lower bound */
            /* Get upper bound */
            /* Add the descriptor to DDIT */
            DBL_S$Desc_add (dbid, attrname, &desctype,
                            &descriptor, &valtype, at_id_no, desc_id_no);
            /* Increment the descriptor id count */
        end /* end while */
        if ( desctype != C )
        begin
            /* Add the catchall descriptor to DDIT */
            DBL_S$Catchall (dbid, attrname,
                            &desctype, at_id_no, desc_id_no);
        end /* end if */
        /* Increment the attribute count */
    end /* end while */
    /* close descriptor file */
end /* end dbl_dir_tbls */



dbl_records ()
begin
    char dbid (DBIDLNTH + 1),
         rfn (MFNLength + 1),     /* record-file name */
         req (REQLength),
         record (80);
    struct rtemp_definition *tmpl_ptr,
                            *get_tmpl_ptr ();
    int  i;
    FILE *fopen (), *fptr;

    /* Get the name of the file */
    /* containing the records to be loaded */
    /* Read the database id */
    /* Get the record template for the database */
    while ( more records exist )
    begin
        /* While there are more records */
        /* Read the next one */
        /* Construct a request to insert record */
        dbl_construct_ins (tmpl_ptr, record, req);
        /* Send the request to Request-Preparation */
        DBL_S$TrafUnit (dbid, req);
    end
end /* end dbl_records */



dbl_construct_ins (tmpl_ptr, record, req)
struct rtemp_definition *tmpl_ptr;
char *record, *req;
begin
    int  i, j, k, p, entry_no;

    /* Load the initial part of request */
    while ( not the end of the record )
    begin
        /* Load the attribute name */
        /* Load the attribute value */
    end
    /* Load the end of request */
end



APPENDIX C

DESIGN SPECIFICATION OF THE TEST REQUEST GENERATION PACKAGE

The program specification for the test request generation
and execution package is shown in this appendix. This
design is the result of the work of Dr. Kerr, who headed the
design of the original test request generation package.

The Top Level of Test Request Generation Package

This program can be used to test and demonstrate MDBS.
The execution of this program is called a session. Each
session can be divided into any number of subsessions.
During a subsession the user can do one of the following:

(A) Execute a list of requests that was previously
stored in a file.

(B) Prompt the user for a list of requests to be
stored in a file for later use.

(C) Retrieve a list of requests that was previously
stored in a file and then allow the user to
select requests from that list for execution.
This selection can be done in any order. The user
will also be able to enter a new request to be
executed.

(D) Modify an existing list of requests that was
previously stored in a file.

In this version, requests are allowed to be grouped as
transactions. A request is sent to MDBS. The program waits
for a response before sending the next request, or will
continue to execute without waiting for a response if the
user so desires.

Output may be directed to the user's terminal or to a
file or to both.

Program Specifications

/*                  */
/*  Test Request    */
/*  Generation      */
/*  Package Design  */
/*                  */

task MDBS_Test;

    scalar more-subsessions;    /* flag: TRUE - continue,
                                         FALSE - stop */

    Print initial message to user;
    more-subsessions := TRUE;

    while more-subsessions do
        perform SUBSESSION;
        Prompt for continue message;
        Read continue message;
        if user does not want to continue
        then
            more-subsessions := FALSE;
        end if
    end while;

end task;



procedure SUBSESSION; 

/* During a subsession the user is able                     */ 
/*   to generate a group of requests. (NEW_LIST)            */ 
/*   to modify an old list of requests. (MODIFY)            */ 
/*   to select requests, one at a time, from a list         */ 
/*     of requests. (SELECT)                                */ 
/*   to run a group of requests. (OLD_LIST)                 */ 

scalar current-request-file;  /* The name of the file.      */ 
    /* Initial value should be NULL. This name must be      */ 
    /* retained from one subsession to the next.            */ 
scalar type-of-subsession;    /* Possible values are NEW_LIST, 
                                 MODIFY, SELECT and OLD_LIST */ 

Prompt for next type-of-subsession; 
Read next type-of-subsession; 

case type-of-subsession value 
    NEW_LIST:  /* Enter a new request-list */ 
        perform NEW_LIST_SUB( current-request-file ); 
    MODIFY:    /* Modify an old list */ 
        perform MODIFY_SUB( current-request-file ); 
    SELECT:    /* Select requests, one at a time, from an */ 
               /* existing request-list                   */ 
        perform SELECT_SUB( current-request-file ); 
    OLD_LIST:  /* Execute an existing request-list */ 
        perform OLD_LIST_SUB( current-request-file ); 
    otherwise: Print error message; 
end case; 
end procedure; 
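
The case statement of SUBSESSION amounts to a dispatch on the subsession type, with the current request file carried from one subsession to the next. A minimal Python sketch follows. The function name run_subsession, the handlers table and the convention that each handler returns the new current file name are assumptions made for illustration only.

```python
from typing import Callable, Dict, Optional

# A handler receives the current request file name (None plays the role
# of NULL) and returns the file name to retain for the next subsession.
Handler = Callable[[Optional[str]], Optional[str]]


def run_subsession(kind: str,
                   current_file: Optional[str],
                   handlers: Dict[str, Handler]) -> Optional[str]:
    """Dispatch one subsession and return the updated current file."""
    handler = handlers.get(kind)
    if handler is None:
        # Corresponds to the "otherwise" branch of the case statement.
        print(f"error: unknown subsession type {kind!r}")
        return current_file
    return handler(current_file)
```

A caller would keep the returned name and pass it to the next call, which is how the sketch retains current-request-file between subsessions.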



procedure NEW_LIST_SUB( output: current-request-file ); 

scalar current-request-file;  /* name of the file */ 

/* Asks user for requests - one at a time.                   */ 
/* Saves list of requests in a file with file-name given by  */ 
/* user.                                                     */ 

scalar request-list-file-name; 
    /* of file to use to store the requests */ 
record request; 
scalar next-step; 
    /* I(nsert), R(etrieve), U(pdate), D(elete) or F(inish) */ 

Prompt for request-list-file-name; 
Read request-list-file-name; 
Open file( request-list-file-name ) output; 
perform ENTER_AND_SAVE_REQUESTS( request-list-file-name ); 
Close file( request-list-file-name ); 
current-request-file := request-list-file-name; 
end procedure; 



procedure MODIFY_SUB( input/output: current-request-file ); 

scalar current-request-file;  /* The name of the file */ 

/* Retrieve an old request-list and then allow the user to  */ 
/* modify it. Requests are examined one at a time allowing  */ 
/* changes to be made to each request in turn. A change     */ 
/* can be                                                   */ 
/*      add new request before this one,                    */ 
/*      modify this request,                                */ 
/*      remove this request,                                */ 
/*      make no changes to this request.                    */ 
/* Note that we must have a way to append new requests at   */ 
/* the end of the input request list.                       */ 
/*                                                          */ 
/* The input file ( called input-request-file ) may be      */ 
/* either the current-request-file or a different existing  */ 
/* request file.                                            */ 
/*                                                          */ 
/* The output file ( called new-request-file ) may be       */ 
/* either the next version of the input-request-file or a   */ 
/* new file.                                                */ 



scalar input-request-file;  /* The list of requests 
                               to be modified. */ 
scalar new-request-file;    /* The new list of requests. */ 
scalar next-version;        /* flag: TRUE - set new-request-file to */ 
    /* next version of input-request-file, FALSE - get new name */ 
record request; 
scalar more-requests-in-input-request-file;  /* continue flag */ 
scalar more-requests-to-enter;               /* continuation flag */ 
scalar change-type;  /* ADD, MODIFY, REMOVE, or NO_CHANGE */ 

scalar next-step; 
    /* I(nsert), R(etrieve), U(pdate), D(elete) or F(inish) */ 



/* Determine input-request-file to be modified. */ 
perform DETERMINE_INPUT_FILE( current-request-file, 
                              input-request-file ); 
open file( input-request-file ) input; 

/* Determine if user wants the name of the new-request-file */ 
/* to be the next version of the input-request-file         */ 
/* or a new name.                                           */ 
Prompt user to determine next-version; 
Read next-version; 
if next-version 
then 
    Set new-request-file to next version of 
        input-request-file; 
else 
    Prompt for new-request-file name; 
    Read name of new-request-file; 
end if; 

open file( new-request-file ) output; 

Read first request from input-request-file; 
more-requests-in-input-request-file := TRUE; 
while more-requests-in-input-request-file do 
    Prompt user for change-type for this request; 
    Read change-type; 
    case change-type value 
        ADD:  /* enter and save the next request */ 
            perform GET_NEW_REQUEST( request ); 
            Write request into new-request-file; 
        MODIFY: 
            Prompt and get modified request from user; 
            Write new request into new-request-file; 
            Read next request from input-request-file; 
        REMOVE: 
            Read next request from input-request-file; 
        NO_CHANGE: 
            Write current request into new-request-file; 
            Read next request from input-request-file; 
        otherwise: Print system error message; 
    end case; 
end while; 

/* Note that at this point all the old requests have been */ 
/* processed. However it is possible that the user wants  */ 
/* to append more requests.                               */ 
Prompt user that input file has been processed, but that 
    more requests may still be appended; 
perform ENTER_AND_SAVE_REQUESTS( new-request-file ); 

close file( input-request-file ); 
close file( new-request-file ); 
current-request-file := new-request-file; 
end procedure; 
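
The editing loop of MODIFY_SUB can be illustrated with a small Python sketch that works on in-memory lists instead of files. Several points are assumptions, since the design does not state them: the loop is taken to end when the input list is exhausted (the design does not show where the continue flag is cleared), the new request of an ADD is kept separate from the current request, any old requests left over when the change list runs out are copied unchanged, and an unknown change type raises an error instead of reprompting. The names apply_changes and Change are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

# (change type, replacement or new request text; "" when unused)
Change = Tuple[str, str]


def apply_changes(old_requests: Sequence[str],
                  changes: Iterable[Change]) -> List[str]:
    """Build the new request list from the old one and a list of changes."""
    new_requests: List[str] = []
    position = 0  # index of the request currently being examined
    for change_type, text in changes:
        if position >= len(old_requests):
            break  # assumed end-of-input condition
        if change_type == "ADD":
            # New request goes before the current one; current one stays.
            new_requests.append(text)
        elif change_type == "MODIFY":
            new_requests.append(text)
            position += 1
        elif change_type == "REMOVE":
            position += 1
        elif change_type == "NO_CHANGE":
            new_requests.append(old_requests[position])
            position += 1
        else:
            raise ValueError(f"unknown change type: {change_type!r}")
    # Assumption: requests not reached by any change are kept as they are.
    new_requests.extend(old_requests[position:])
    return new_requests
```

Appending further requests after the old list has been processed, as ENTER_AND_SAVE_REQUESTS does in the design, would simply extend the returned list.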



procedure SELECT_SUB( input/output: current-request-file ); 
scalar current-request-file;  /* The name of the file */ 

/* Retrieve an old list of requests.      */ 
/* Allow user to select from this list.   */ 
/* Also allow user to enter new request.  */ 

scalar input-request-file;    /* file containing requests */ 
array requests( MAX_NUMBER_OF_REQUESTS ); 
                              /* from input-request-file */ 
scalar number-of-requests;    /* The actual number in            */ 
    /* input-request-file must be less than */ 
    /* MAX_NUMBER_OF_REQUESTS               */ 
scalar request-number;        /* of the request chosen */ 
record new-request;           /* Provided by user. */ 
record response;              /* to the request being executed. */ 

scalar more-to-execute;       /* flag to control loop */ 
scalar next-operation; 
    /* Values can be REQUEST_NUMBER, DISPLAY, */ 
    /* NEW_REQUEST or STOP                    */ 

/* Determine the new input-request-file to use for */ 
/* this subsession.                                */ 
perform DETERMINE_INPUT_FILE( current-request-file, 
                              input-request-file ); 
open( input-request-file ); 
Read and store input-request-file into requests checking that 
    number-of-requests is less than MAX_NUMBER_OF_REQUESTS; 
close( input-request-file ); 

perform DISPLAY( requests ); 

/* Determine whether response is to go to CRT, file or both */ 
perform OUTMSFORMAT; 
more-to-execute := TRUE; 

while more-to-execute do 
    Prompt user for next-operation;  /* should be either      */ 
        /* request-number, a request-to-display or a         */ 
        /* new-request                                       */ 
    Read next-operation; 

    case next-operation value 
        REQUEST_NUMBER: 
            Check that request-number is less than 
                number-of-requests; 
            perform EXECUTE( requests(request-number), 
                             response ); 
            /* Output the response to CRT, file or CRT and file, 
               as appropriate. */ 
            perform OUTMSRESPONSE( response ); 

        DISPLAY: perform DISPLAY( requests ); 

        NEW_REQUEST: 
            perform GET_NEW_REQUEST( new-request ); 
            perform EXECUTE( new-request, response ); 
            /* Output the response to CRT, file or CRT and file, 
               as appropriate. */ 
            perform OUTMSRESPONSE( response ); 

        STOP: more-to-execute := FALSE; 
        otherwise: print error message; 
    end case; 
end while; 

perform OUTMSFINISH; 
current-request-file := input-request-file; 

end procedure; 
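
The core of the SELECT subsession, executing requests chosen by number and sending each response to the terminal, to a file, or to both, can be sketched in Python as below. This is only an illustration: execute_selected is a hypothetical name, the execute argument stands in for the call to MDBS, request numbers are assumed to start at 1, and the request strings in the usage example are placeholders and not MDBS request syntax.

```python
import io
import sys
from typing import Callable, Optional, Sequence, TextIO


def execute_selected(requests: Sequence[str],
                     choices: Sequence[int],
                     execute: Callable[[str], str],
                     terminal: Optional[TextIO] = None,
                     log_file: Optional[TextIO] = None) -> int:
    """Execute the chosen requests in the order given.

    Each response is written to every output that was supplied
    (terminal, file, or both). Returns how many requests were executed.
    """
    executed = 0
    for number in choices:
        # Reject numbers outside the stored list (1-based, by assumption).
        if not 1 <= number <= len(requests):
            print(f"error: no request numbered {number}", file=sys.stderr)
            continue
        response = execute(requests[number - 1])
        for sink in (terminal, log_file):
            if sink is not None:
                sink.write(response + "\n")
        executed += 1
    return executed


if __name__ == "__main__":
    saved = io.StringIO()  # stands in for the output file
    execute_selected(["RETRIEVE a", "DELETE b"], [2, 1],
                     execute=lambda r: "done: " + r,
                     terminal=sys.stdout, log_file=saved)
```

Passing the output destinations as optional arguments mirrors the choice made once per subsession in the design, before any request is executed.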



procedure OLD_LIST_SUB( current-request-file ); 

scalar current-request-file;  /* The name of the file */ 

/* Retrieve and execute an old list of requests. */ 

scalar input-request-file;    /* The file containing requests */ 
record request; 
record response;  /* to a request that has been executed. */ 

/* Determine the new current-request-file to use for this */ 
/* subsession.                                            */ 
perform DETERMINE_INPUT_FILE( current-request-file, 
                              input-request-file ); 
Open( input-request-file ) input; 
Read first request from input-request-file; 

/* Determine whether response is to go to CRT, file or both. */ 
perform OUTMSFORMAT; 
while more-requests do 
    perform EXECUTE( request, response ); 
    /* Output the response to CRT, file or CRT and file, as */ 
    /* appropriate.                                         */ 
    perform OUTMSRESPONSE( response ); 
    Read next request from input-request-file; 
end while; 

perform OUTMSFINISH; 
close( input-request-file ); 
current-request-file := input-request-file; 

end procedure; 



procedure ENTER_AND_SAVE_REQUESTS 
        ( input: request-list-file-name ); 
scalar request-list-file-name; 
    /* of file to use to store the requests */ 
record request; 
scalar next-step; 
    /* I(nsert), R(etrieve), U(pdate), D(elete) or F(inish) */ 

next-step := I; 
while next-step ≠ F do 
    Prompt for next-step; 
    case next-step value 
        I: /* enter and save the next insert request */ 
            perform INSERT_SUB( request ); 
            Write request into request-list-file-name; 
        R: /* enter and save next retrieve request */ 
            perform RETRIEVE_SUB( request ); 
            Write request into request-list-file-name; 
        U: /* enter and save the next update request */ 
            perform UPDATE_SUB( request ); 
            Write request into request-list-file-name; 
        D: /* enter and save the next delete request */ 
            perform DELETE_SUB( request ); 
            Write request into request-list-file-name; 
        F: /* Finish entering requests */ 
        otherwise: Print error message; 
    end case; 
end while; 
end procedure; 



procedure DETERMINE_INPUT_FILE( input: current-request-file, 
                                output: input-request-file ); 
scalar current-request-file; 
scalar input-request-file; 

/* Determine the input file to be used. It may be either */ 
/* the current-request-file or a different existing      */ 
/* request file.                                         */ 

scalar modify-current-file-flag; 
    /* TRUE - select new input file */ 

if current-request-file is NULL 
then 
    Prompt for name of input-request-file; 
    Read name of input-request-file; 
else  /* Determine if user wants to use the            */ 
      /* current-request-file or a different old file. */ 
    Prompt user to determine modify-current-file-flag; 
    Read modify-current-file-flag; 
    if modify-current-file-flag 
    then 
        Prompt for name of input-request-file; 
        Read name of input-request-file; 
    else 
        input-request-file := current-request-file; 
    end if; 
end if; 
end procedure; 
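
The decision made by DETERMINE_INPUT_FILE can be written as a short Python function. The sketch assumes a caller-supplied ask function that shows a prompt and returns the user's reply. That function, the name determine_input_file and the prompt wording are hypothetical, and "y" or "yes" is assumed to mean that a different file is wanted.

```python
from typing import Callable, Optional


def determine_input_file(current_file: Optional[str],
                         ask: Callable[[str], str]) -> str:
    """Return the name of the request file to read in this subsession."""
    if current_file is None:
        # No file has been used yet, so a name must be supplied.
        return ask("Name of input request file: ")
    reply = ask(f"Use a file other than {current_file}? (y/n): ")
    if reply.strip().lower() in ("y", "yes"):
        return ask("Name of input request file: ")
    return current_file
```

For interactive use, the built-in input function can be passed as ask.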



procedure GET_NEW_REQUEST( output: request ); 

record request;  /* to be obtained from user */ 

/* Prompts user for information necessary to enter a */ 
/* new request. Returns the request.                 */ 

scalar request-type; 
    /* I(nsert), R(etrieve), U(pdate) or D(elete) */ 

Prompt for next request-type; 
Read request-type; 

case request-type value 
    I: perform INSERT_SUB( request ); 
    U: perform UPDATE_SUB( request ); 
    D: perform DELETE_SUB( request ); 
    R: perform RETRIEVE_SUB( request ); 
    otherwise: Print error message; 
end case; 
end procedure; 



procedure DISPLAY( input: requests ); 

/* Display the requests and their numbers at the */ 
/* terminal.                                     */ 

array requests( MAX_NUMBER_OF_REQUESTS ); 
    /* to be displayed. */ 
end procedure; 



procedure EXECUTE( input: request, 
                   output: response ); 

/* Ask MDBS to execute this request. Return the response. */ 
record request;   /* to be executed */ 
record response;  /* to the execution of the request */ 
end procedure; 